Is a naturalistic account of reason compatible with its objectivity?

Can rational objectivism be implemented in a connectionist system (like the brain)?

Greg Detre

Tuesday, January 29, 2002

Dr Tasioulas

 

Introduction

At root, connectionism amounts to the thesis that the brain is a dynamical system, like a mathematically modellable complex of levers and pulleys, or in this case, neurons and synapses. The high-level behaviour of the system seems to emerge like magic out of a morass of low-level interactions, just as the seemingly-centralised wheeling and coordination of a flock of birds results from each bird paying attention to purely local rules, e.g. the position and speed of its neighbours.

Defining connectionism

More specifically, connectionism refers to the family of theories that aim to understand mental abilities in terms of formalised models of the brain. These usually employ large numbers of nodes (neurons), with weighted inter-connections (synapses). The firing rate of a neuron is usually some non-linear function (e.g. sigmoid) of its activity, which is calculated as the weighted sum of the firing rates of neurons that synapse onto it. In this way, activity is propagated over time (milliseconds, in practice) in parallel from the input neurons eventually to the output neurons.

Input neurons are defined as those whose activation is (at least partially) determined by the external environment (in the case of the brain, various sensory receptors), and output neurons are those which effect some change in the system's behaviour in that environment (e.g. motor neurons connected to muscle); hidden neurons are those whose activity is invisible to the environment.
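To make this propagation of activity concrete, here is a minimal sketch in Python (the layer sizes, the random weights and the choice of a sigmoid activation function are purely illustrative, not drawn from any particular model) of input firing rates being transformed, via a layer of hidden neurons, into output firing rates:

```python
import numpy as np

def sigmoid(x):
    # Non-linear activation: squashes a neuron's summed input into a (0, 1) firing rate.
    return 1.0 / (1.0 + np.exp(-x))

def propagate(inputs, w_in_hidden, w_hidden_out):
    # Each neuron's activity is the weighted sum of the firing rates of the
    # neurons that synapse onto it; its own firing rate is a non-linear
    # (here, sigmoid) function of that activity.
    hidden = sigmoid(inputs @ w_in_hidden)      # hidden-layer firing rates
    outputs = sigmoid(hidden @ w_hidden_out)    # output-layer firing rates
    return outputs

# Toy network: 3 input neurons, 4 hidden neurons, 2 output neurons, random weights.
rng = np.random.default_rng(0)
w_in_hidden = rng.normal(size=(3, 4))
w_hidden_out = rng.normal(size=(4, 2))
print(propagate(np.array([1.0, 0.0, 0.5]), w_in_hidden, w_hidden_out))
```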

What makes neural networks interesting is their ability to self-organise, or 'learn', by modifying their weights according to a learning algorithm. The simplest are the Hebbian-type learning rules[1], which are based on the principle:

the synapse between two neurons should be strengthened if the neurons fire simultaneously

This can be implemented in a pattern-associator, an architecture for associating a set of input patterns with a set of pre-specified output patterns. Innumerable improvements and revisions have been employed, and the Hebbian rule really only works well for orthogonal (i.e. uncorrelated) input patterns, but its human-like robustness and ability to generalise are notable. When presented with a novel pattern which is similar but not identical to a learned input pattern, the network's output will be similar or identical to the learned output pattern. It can thus be seen to generalise to new data, and to form prototypes based on family resemblances between input patterns, both of which features had to be explicitly, inelegantly and inefficiently built into earlier symbolic models.
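A bare-bones sketch of such a pattern-associator (assuming binary pattern vectors, a simple outer-product form of the Hebbian rule and a thresholded recall step; the particular patterns are purely illustrative) shows how a degraded cue can still retrieve the learned output:

```python
import numpy as np

def hebbian_train(input_patterns, output_patterns):
    # Hebbian rule: strengthen each synapse in proportion to the co-activity of
    # its pre- and post-synaptic neurons (an outer product), summed over pairs.
    W = np.zeros((output_patterns.shape[1], input_patterns.shape[1]))
    for x, y in zip(input_patterns, output_patterns):
        W += np.outer(y, x)
    return W

def recall(W, x):
    # Recall: each output neuron fires if the weighted sum of its inputs exceeds zero.
    return (W @ x > 0).astype(float)

# Two roughly orthogonal input patterns, each associated with an output pattern.
X = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
Y = np.array([[1, 0], [0, 1]], dtype=float)
W = hebbian_train(X, Y)

# A novel, degraded cue similar to the first input still produces the first output.
print(recall(W, np.array([1, 0, 0, 0], dtype=float)))  # -> [1. 0.]
```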

Smolensky's stronger sense of 'connectionism'

Before continuing, I want to mention a second, stronger sense in which the term 'connectionism' is used as a thesis about the workings of the mind. The stronger claim, as espoused by Smolensky, can be stated negatively: a symbolic, cognitive-level description cannot fully capture (i.e. specify in law-like terms) our mental activity. That is, if we want to fully understand (i.e. account for or predict) the workings of the mind, we cannot talk at the level of psychology, but must (at least partially) descend towards the neural level. Smolensky maintains that there exists a sub-symbolic level, above the neural level, consisting of non-semantically-evaluable constituents or micro-features of symbols, at which we will be able to fully specify (i.e. capture nomologically) mental activity.

If we reject this stronger thesis of connectionism, we are left with the (more or less incontrovertible) physiological evidence that the brain comprises approximately 10^11 neurons, linked by about 10^14 synapses, which are the main units of computation. Since a connectionist system can be seen as a Universal Turing Machine, we can (if we choose) simply see the neuronal level as implementing the symbols posited by psychologists and GOFAI researchers.

However, I find Smolensky's view highly congenial: it seems implausible to me that the labyrinthine workings of the brain can be cleanly distilled down to a manageable number of discrete boxes (or 'modules', in Fodor's sense), each with an informationally-encapsulated, specific domain/function etc.[2] I will use this stronger sense of 'connectionism' from now on, since I consider it interesting, powerful and plausible, and only really open to a single extra objection, the systematicity objection.

Main

Why am I discussing connectionism?

I have raised the issue of connectionism because its success at explaining and understanding low-level neural phenomena is firmly established, and there is cause for optimism that it will prove a valid paradigm for investigation of much, if not all, of the higher levels of the brain and our behaviour. However, there are a number of a priori concerns that need to be addressed and laid to rest before this optimism can be justified.

Discrete vs analog (probabilistic/statistical?)

The brain is a more or less analog system. It is a dynamical system operating in real time (as opposed to discrete time-steps), based on continuous variables like membrane potential, synaptic weight strength etc. (admittedly, at the molecular level the quantity of neurotransmitter at a given synapse is discrete, but this is a moot point). It seems intuitive that, since the computations being performed by the system are analog and the outputs are also analog, a neural system could not give discrete responses: at best, the system might respond with a very high tendency in one direction or another, but the neurons are not binary, and do not give 'true' or 'false' answers, only high or low firing rates. As a result, the sort of binary formal logic that mathematicians, logicians and rationalists employ seems inappropriate for such a system. More fundamentally, it seems as though such a system could never be certain, in the way Nagel requires. If it were to turn out that our minds are inherently probabilistic, and could only consider a proposition to be 99.9% true, or infer the correct consequences of a belief most of the time, then reason's primary position as an ultimately trustworthy source of authority would be fundamentally and irrecoverably undermined.

Fortunately though, our irrationality cannot, I believe, be so easily demonstrated. The objection rests on a confusion between the neural and behavioural levels, that is, between the way that individual neurons operate and the way the overall dynamical system that they comprise operates. A crucial aspect of a connectionist system's dynamics is its non-linearity. The most obvious source of this non-linearity is the activation function relating neuronal activity (membrane potential) to firing rate (rate of action potentials produced). As mentioned above, the activation of a neuron can be expressed more or less as the weighted (according to the strength of each synapse) sum of all its inputs. The firing rate is not, however, proportional to this activity. A low activation may produce the occasional lonely action potential, but as the activity increases, the firing rate increases non-linearly, up to an asymptote (determined by the bare minimum 'absolute refractory period' between action potentials that a neuron requires to 'recharge', so to speak). This non-linear function could be binary, threshold-linear, sigmoid or logarithmic[3]. All that matters is that it is not simply linear. This non-linearity gives rise to peculiar dynamics at a high level, i.e. in ensembles of neurons collectively forming a distributed representation, which can begin to seem more and more discrete. We can understand this intuitively if we consider that each neuron will be only slightly activated if its input neurons are not firing vigorously, and so will in its turn hardly fire at all; if its input neurons are firing rapidly, however, its output will be especially high. Consequently, at a high level, after numerous successive computations have been performed, a more or less binary output could easily result.
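The sharpening effect can be shown with a toy calculation (the sigmoid gain and the number of stages are arbitrary; each stage stands in, very loosely, for a further layer of neurons): a graded input passed through successive non-linear stages drifts towards the extremes, i.e. towards an effectively discrete response.

```python
import numpy as np

def sigmoid(x, gain=8.0):
    # Sigmoid activation; the gain sets how sharply the firing rate saturates.
    return 1.0 / (1.0 + np.exp(-gain * x))

# Graded activity between 0 and 1, passed through four successive non-linear
# stages. Clear-cut inputs are pushed towards 0 or 1; only the perfectly
# ambiguous mid-point stays graded.
x = np.linspace(0.0, 1.0, 5)
for stage in range(4):
    x = sigmoid(x - 0.5)   # centre on the mid-point, then squash
print(np.round(x, 3))      # values near 0, 0.5 and 1
```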

It should of course be noted that the real situation in the brain is considerably more complicated than has been outlined here. The brain makes use of graded and temporally patterned firing, rather than simply treating the signal as a mean or 'rate' code over a short period of time, and so can exploit the informational content of temporal synchronisation, e.g. as employed in sound localisation. All consideration of inhibitory neurons, neurons with spontaneously high firing rates, the effects of random noise, and competing or inhibitory modules has been stripped from the account in order to make the essential point: that the brain can be considered to work in a discrete way at a high level.

Generality (rationality seems to be able to work in more or less any domain)

I want now to discuss a deeper concern: to what extent could a connectionist system be as general in its domain of applicability as Nagel's rationality requires?

Computational models have demonstrated that simple logic gates (like AND or OR) can be easily simulated by neural networks. Indeed, much more complicated functions can be replicated too. However, these might be considered to be misleadingly simple cases, since the number of possible permutations is small enough to be contained inside the training set. The system can learn, like a finite state machine, a set of prescribed absolute responses for the given input patterns. This is clearly not an option for most problems. One of the major strengths of a connectionist system is that it can generalise. It forms prototypes from the data, and is able to gauge the similarity between given patterns. As a result, it is able to respond appropriately to novel patterns, and so degrade 'gracefully'. Connectionist systems, unlike the programs running on most desktop computers today, are robust. By this, I mean that unexpected, erroneous or corrupt data does not bring the system to its knees. If you feed a neural network damaged or incomplete data, it will settle into the closest attractor available, based on the weight organisation that has arisen from its training.
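As a simple illustration of the first point, a single threshold neuron with hand-set weights can compute AND or OR (the weights and threshold below are purely illustrative), and the same unit already degrades gracefully in a modest sense: a slightly corrupted input still falls on the correct side of the threshold.

```python
import numpy as np

def threshold_unit(weights, threshold, x):
    # A single neuron: fire (1) if the weighted sum of its inputs exceeds the threshold.
    return 1.0 if np.dot(weights, x) > threshold else 0.0

AND = lambda x: threshold_unit(np.array([1.0, 1.0]), 1.5, x)  # fires only for (1, 1)
OR  = lambda x: threshold_unit(np.array([1.0, 1.0]), 0.5, x)  # fires for any active input

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, AND([a, b]), OR([a, b]))

# A slightly corrupted input (0.9, 0.8) is still treated as (1, 1).
print(AND([0.9, 0.8]), OR([0.9, 0.8]))  # -> 1.0 1.0
```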

Rationality, in order to 'arrive at principles that are universal and exceptionless - to be able to come up with reasons that apply in all relevantly similar situations, and to have reasons of similar generality that tell us when situations are relevantly similar', seems perhaps to require too much of a network. In a way, this is an empirical question: 'Is the data set to which our brains have been exposed sufficiently broad and representative for us to be able to reason reliably about the areas to which we apply it?' It requires an implausible stretch of the imagination to explain how our senses could provide the data by means of which we could learn to reason mathematically or logically.

This can be seen as another way of asking the question that led Alfred Wallace (the lesser-known co-discoverer of evolution by natural selection) astray: why would early man require a brain capable of playing chess or writing poetry? Despite conceiving of evolution in more or less the same way as Darwin, and at the same time, Wallace remained a creationist about intelligence, because he considered modern man's intelligence to be superior to that of early Homo sapiens (the savage languages 'contain no words for abstract conceptions; the utter want of foresight of the savage man beyond his simplest necessities; his inability to combine, or to compare, or to reason on any general subject that does not immediately appeal to his senses'[4]), and indeed to be far beyond what is necessary to sustain a forager lifestyle. The fact that early and modern man are, at least phylogenetically (that is, as a species), more or less cognitive equals can be explained in a number of ways. I have tried to cover the most important reasons why we might have evolved to be rational in the section on evolution, so in this connectionist section I am interested primarily in how it is that our physiology could be understood as implementing this rationality.

Plasticity is the main answer to both of these related questions: the question of the extent to which a connectionist system could be as general in its domain of applicability as Nagel's rationality requires, and the question of how a connectionist system originally designed for a forager lifestyle could be capable of playing chess and reasoning formally.

At this point, we have to remember something very obvious: people's reasoning improves with time. This is partly through the basic genetic and developmental processes that govern, for example, our improving hand-eye coordination through childhood and puberty. However, as evidenced by the effects of education, human cognitive capacities can also be trained in certain directions, allowing us to build enormous pyramid-like conceptual toolkits. Maths is probably the most obvious example. To take a very basic case, we learn what the 'addition' operator means through continual, repetitive usage, practising sums as small children. Over time, somehow, this process 'chunks' into a simple, atomic 'concept' or automaticised 'habit of thought' that we can use unthinkingly when trying to master more complicated concepts which build upon it, e.g. multiplication, or addition of complex numbers.

What we are actually doing is building new, abstract spaces within which we become increasingly adept at operating[5]. We see this process going on every day: when we learn a new word, there is an acclimatisation period during which it becomes necessary to reiterate the definition every time we encounter the word, but through usage and repeated encounters, it nestles into our vocabulary web. Variants of this process are going on when we learn new languages, mathematics, formal logic, analytic philosophical reasoning etc. Much more is going on than simply learning new words: we are creating new domains within which certain mental operations are easy or appropriate, just as it can be easier to express an idea in one language than another, or through an image rather than words. These domains piggyback upon and inter-weave with each other. Certainly, we shouldn't expect to see obvious delineations at the neural level.

If we want to better understand the development of these abstract spaces, there are various basic cases to consider. For example, each sensory modality gives us a different way of perceiving the world, in both a trivial and a non-trivial sense. And if there is any doubt that the brain is sufficiently plastic to self-organise entirely new domains of thought on the fly, there is plenty of experimental evidence that it can, especially for language and the somatosensory system, and especially in childhood.

In neurophysiological terms, it is clear that our brains undergo various genetically-timed stages of progression, especially during our earliest years, initially forming an enormous profusion of synaptic connections that are subsequently pruned. This is not what I am really referring to: as we progress through education, even long beyond the point at which our brains are undergoing developmental (i.e. internally-prescribed) changes, our ability to reason improves. We are continually forming new conceptual spaces, and this improvement is incremental. This is related to the reason that maths, for instance, requires an element of trudging practice that cannot be avoided. An essential part of learning a new theory or technique is practising it, repeatedly, with different problems. In this way, we are expanding our set of training data to be more representative of a given problem domain, and in the process expanding the generalisation ability of our reasoning. This is exactly what philosophers do when they read each other's work: they are expanding their training data.

Of course, this makes one rather big assumption: it assumes that philosophy, or even the separate areas of philosophy, can in some way be divided up into domains within which exposure to 'training data' (i.e. the literature) is helpful.

Penrose: non-computable human mental functioning

Perhaps the broadest criticism of all such approaches stems from Gödel's theorem, most famously advocated in relation to the mind-body problem by Lucas, and more recently by Penrose[6]. Gödel's theorem states that in any consistent formal system above a certain complexity, there will always be formally undecidable, true propositions, i.e. statements that are true but which cannot be proved within the system. This thwarted attempts like Russell and Whitehead's Principia Mathematica to found the whole of mathematics on a minimal set of principles (axioms). It also poses problems for connectionist systems. Part of the appeal of a connectionist system is that it can be seen as a Universal Turing Machine; consequently, though, formally non-computable functions cannot be finitely implemented by such a system. Penrose argues that the brain (i.e. people) *can* do this, seeing the truth of such formally unprovable propositions, and so our minds must be more than Turing machines.

Of course, Penrose is himself foremost a physicist and mathematician, and so very much a believer in reason's objectivity. He proposes that there must be more going on in the brain than we are currently aware of at the sub-neural level: he speculates that quantum effects in microtubules in the brain may allow us to perform non-computable functions, to see outside the system where even a very fast machine (such as Deep Blue) would flounder and fail to make the meta-inference.

This is a tricky area to discuss, since Penrose's extensive proof would be the natural initial point of attack, but the mathematics is beyond the scope of this paper. As a result, I intend to skirt the issue. However, it is worth noting that there is very little evidence to support Penrose's substantive claims (about quantum effects in microtubules), and that the issue of human fallibility may complicate the picture of the brain as a normal formal system.

If Penrose were proved right, then much of the debate currently centring on the capabilities of purely connectionist systems would become almost irrelevant, because the nature of such a quantum system would probably be unimaginably different and more powerful. If anything, I would be more tempted to ask what limitations such a hypothetical system would have, and whether our brains actually seem much more limited than one would expect of such a system.

Hard-wired neural representations

Pursuing a greater understanding of our genome, and of the way in which it gives rise to a fully-functioning human body and mind, is one of the main focuses of current scientific effort. We have every reason to consider it an area that will eventually yield to standard empirical techniques, laced with the usual input of conceptual imagination and brute computational power that all frontier science requires.

Firstly, we can build up our understanding by considering far simpler genetic mechanisms in simpler organisms. Moreover, with such organisms we are not bound by the same ethical constraints, and can experimentally manipulate, and so observe, the relationship between genotype and phenotype. The situation is more complicated when considering the human genome, since we cannot simply alter a gene here or there and dispassionately note the result. Secondly, the situation is incomparably more complex. Indeed, this is the major difficulty facing us when we try to decode our DNA: unlike planetary bodies and neural networks, DNA may not be formalisable or reducible to a set of underlying dynamical equations. Our scientific toolkit has trouble dealing with irreducibly complex systems, where one component affects another, which in turn affects a multitude of other genes, each of which has a direct or indirect effect back. This is the case with our genome. Each gene expresses itself in terms of proteins, which have various effects on each other; and this is still the low level. At a higher level, the mass of protein interactions gives rise to the cellular system that makes up the human body, including the brain. Perhaps the only way to understand such a system will be to simulate it.

The best understanding available involves an interaction between our genotype and our environment, which results in our eventual phenotype. That is, the way we are and the way our body develops is a product of the interplay between our genes and the experiences we have. In terms of neural development, this can be expressed in terms of internal (developmental) and external (learning) processes. The 'nature-nurture' debate has thus decomposed into a discussion of the relative proportions with which either factor contributes, and under which circumstances. This duality is usually termed 'interactionism'.

Tooby and Cosmides' notion of ecological rationality, which speculates that we are genetically hard-wired to be cognitively well-suited to certain domains of action, and Nozick's stronger notion of chains of reasoning that have become automaticised so as to seem 'self-evident', both require our genes to be able to specify neural representations for such ideas and behaviour quite precisely. There is some debate about the degree of control our genes could have over low-level synaptic organisation, or whether in fact the genetic constraints are very broad, determining perhaps only architectural or timing parameters[7]. This is ultimately an empirical issue, but one that is unlikely to be categorically settled for a considerable time. It seems implausible to me, though, that our DNA would code for such regularities as the assumption of other minds, or of an external world, but the possibility cannot be dismissed. If it does, this certainly makes things easier for Tooby and Cosmides, and for Nozick.

Systematicity (cf. Chomsky's argument about combinatorial explosion)

The problem of systematicity relates specifically to the thesis that the sub-symbolic level is the highest level at which the workings of the mind can be fully specified.

When we reason, or indeed form a sentence, we relate a series of symbols (whether words, propositions or names) interchangeably by syntax. Although connectionist models can be trained to be systematic, they can also be trained, for example, to recognise 'John loves Mary' without being able to recognise 'Mary loves John' (the problem of 'systematicity').

Although in principle, this seems like quite a telling objection, it need not be. Smolensky proposes one solution. Effectively, it involves a distributed representation composed of pairs of neurons (or more likely, pairs of mini-ensembles). One of the pair specifies the content (e.g. the word), and the other specifies the role being played (i.e. the position in the sentence). A string of such pairs could thus specify both:

(loves, 2) (John, 1) (Mary, 3) = John loves Mary

or

(loves, 2) (John, 3) (Mary, 1) = Mary loves John

Admittedly, this solution is inelegant, probably impractical and inflexible, and biologically implausible, but it does neatly settle the central issue raised by Fodor, of how a formal syntax could be implemented in a distributed connectionist system. Fodor's attack would be more problematic for early straw-man ideas of lexical processing which hypothesised separate, synchronised lists of words categorised by part of speech, for instance.
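The following is a rough sketch in the spirit of Smolensky's proposal rather than a faithful reproduction of it: words and roles are given arbitrary distributed codes (random vectors), each (content, role) pair is bound by an outer product, and the pairs are superimposed into a single representation. Probing with a role vector then recovers, approximately, which word filled that role, so that 'John loves Mary' and 'Mary loves John' end up with distinct representations.

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_vec(n=50):
    # An arbitrary distributed code: a random unit vector.
    v = rng.normal(size=n)
    return v / np.linalg.norm(v)

words = {w: rand_vec() for w in ("John", "loves", "Mary")}
roles = {r: rand_vec() for r in ("1", "2", "3")}   # sentence positions as roles

def bind(pairs):
    # Superimpose the outer products of (content, role) pairs into one
    # distributed representation of the whole sentence.
    return sum(np.outer(words[w], roles[r]) for w, r in pairs)

def unbind(sentence, role):
    # Probe the representation with a role vector to recover (approximately)
    # the word bound to that role, then pick the closest known word.
    probe = sentence @ roles[role]
    return max(words, key=lambda w: np.dot(words[w], probe))

john_loves_mary = bind([("John", "1"), ("loves", "2"), ("Mary", "3")])
mary_loves_john = bind([("Mary", "1"), ("loves", "2"), ("John", "3")])

print(unbind(john_loves_mary, "1"), unbind(john_loves_mary, "3"))  # John Mary
print(unbind(mary_loves_john, "1"), unbind(mary_loves_john, "3"))  # Mary John
```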

How can connectionism inform our understanding of rationality?

I am going to contend that some variant on these claims will remain the dominant way of thinking about the mind and brain for the foreseeable future, and that this should inform our understanding of rationality in a number of ways. To some degree, adherence to this picture narrows down what we can be capable of as connectionist-implemented rationalists; most notably, it serves as a constant reminder of our finitude (see my discussion of Cherniak's 'minimal rationality' below). At the same time though, it may helpfully flesh out our conception of ourselves as rational beings, partly by restricting or constraining the number and type of possible explanations, and partly by providing a good idea of the sort of properties we should expect to find.

Hopefully, considering ourselves as connectionist-rationalists might give us a new approach to the problem of alternate rationalities (i.e. 'conceptual schemes'). I am thinking of the low-level differences between the brains of every human on the planet, despite their being very similar macroscopically. In terms of the actual computation being performed, nobody thinks in exactly the same way. It is an empirical question how similar our brains are, but it is certainly clear that mapping an area from one brain to the corresponding location in another brain is far from easy (as neuroimaging researchers constantly find). It may be that these differences amount to more or less identical computational processes at a higher level. One might imagine such functionally irrelevant differences as being analogous to the difference between, say, 2(a + b) and 2a + 2b.

Perhaps, if we were able to say how people's brains differ in terms of the computations being performed, we might eventually begin to trace a broad schema of computational approaches which qualify as rational, to a greater or lesser degree. In fact, a growing number of approaches seek an understanding of the mind in terms of numerous interacting components, moving away from the 'monolithic internal models, monolithic control, and general purpose processing' of 'classical AI' (Brooks et al. at MIT, Dennett's multiple drafts, Fodor's modules). I will discuss some ideas for how these interacting cognitive components comprising 'rationality' might be taxonomised.

 

 



[1] Hebb, D.O. (1949). The organization of behavior. New York: Wiley

[2] Fodor, Modularity of mind. Fodor defines a 'module' in terms of nine features, of which I have mentioned two of the most important (domain specificity and informational encapsulation). The others are: mandatory operation; limited central access to the representations computed by input systems; speed; 'shallow' outputs; association with a fixed neural architecture; characteristic and specific breakdown patterns; and a characteristic pace and sequencing of ontogenetic development.

[3] Rolls, The Brain and Emotion

[4] Wallace, cited in Pinker, How the mind works

[5] Plunkett, personal communication

[6] Penrose, Shadows of the mind

[7] Elman et al., Rethinking innateness